{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "# Core Colang Concepts\n",
    "\n",
    "This guide builds on the [Hello World guide](../1_hello_world/README.md) and introduces the core Colang concepts you should understand to get started with NeMo Guardrails."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "# Init: copy the previous config.\n",
    "!cp -r ../1_hello_world/config ."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Prerequisites\n",
    "\n",
    "This \"Hello World\" guardrails configuration uses the OpenAI `gpt-3.5-turbo-instruct` model.\n",
    "\n",
    "1. Install the `openai` package:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "!pip install openai"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "2. Set the `OPENAI_API_KEY` environment variable:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "!export OPENAI_API_KEY=$OPENAI_API_KEY  # Replace with your own key"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "3. If you're running this inside a notebook, patch the AsyncIO loop."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## What is Colang?\n",
    "\n",
    "Colang is a modeling language for conversational applications. Use Colang to design how the conversation between a user and a bot should happen.\n",
    "\n",
    "> **NOTE**: throughout this guide, bot means the entire LLM-based Conversational Application.\n",
    "\n",
    "## Core Concepts\n",
    "\n",
    "In Colang, the two core concepts are *messages* and *flows*.\n",
    "\n",
    "### Messages\n",
    "\n",
    "In Colang, a conversation is modeled as an exchange of messages between a user and a bot. An exchanged message has an *utterance*, such as *\"What can you do?\"*, and a *canonical form*, such as `ask about capabilities`. A canonical form is a paraphrase of the utterance to a standard, usually shorter, form.\n",
    "\n",
    "Using Colang, you can define the user messages that are important for your LLM-based application. For example, in the \"Hello World\" example, the `express greeting` user message is defined as:\n",
    "\n",
    "```\n",
    "define user express greeting\n",
    "  \"Hello\"\n",
    "  \"Hi\"\n",
    "  \"Wassup?\"\n",
    "```\n",
    "\n",
    "The `express greeting` represents the canonical form and \"Hello\", \"Hi\" and \"Wassup?\" represent example utterances. The role of the example utterances is to teach the bot the meaning of a defined canonical form.\n",
    "\n",
    "You can also define bot messages, such as how the bot should converse with the user. For example, in the \"Hello World\" example, the `express greeting` and `ask how are you` bot messages are defined as:\n",
    "\n",
    "```\n",
    "define bot express greeting\n",
    "  \"Hey there!\"\n",
    "\n",
    "define bot ask how are you\n",
    "  \"How are you doing?\"\n",
    "```\n",
    "\n",
    "If more than one utterance is given for a canonical form, the bot uses a random utterance whenever the message is used."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "If you are wondering whether *user message canonical forms* are the same as classical intents, the answer is yes. You can think of them as intents. However, when using them, the bot is not constrained to use only the pre-defined list."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "### Flows\n",
    "\n",
    "In Colang, *flows* represent patterns of interaction between the user and the bot. In their simplest form, they are sequences of user and bot messages. In the \"Hello World\" example, the `greeting` flow is defined as:\n",
    "\n",
    "```colang\n",
    "define flow greeting\n",
    "  user express greeting\n",
    "  bot express greeting\n",
    "  bot ask how are you\n",
    "```\n",
    "\n",
    "This flow instructs the bot to respond with a greeting and ask how the user is feeling every time the user greets the bot."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Guardrails\n",
    "\n",
    "Messages and flows provide the core building blocks for defining guardrails, or rails for short. The previous `greeting` flow is in fact a rail that guides the LLM how to respond to a greeting.\n"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## How does it work?\n",
    "\n",
    "This section answers the following questions:\n",
    "\n",
    "- How are the user and bot message definitions used?\n",
    "- How is the LLM prompted and how many calls are made?\n",
    "- Can I use bot messages without example utterances?\n",
    "\n",
    "Let's use the following greeting as an example."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello World!\n",
      "How are you doing?\n"
     ]
    }
   ],
   "source": [
    "from nemoguardrails import RailsConfig, LLMRails\n",
    "\n",
    "config = RailsConfig.from_path(\"./config\")\n",
    "rails = LLMRails(config)\n",
    "\n",
    "response = rails.generate(messages=[{\n",
    "    \"role\": \"user\",\n",
    "    \"content\": \"Hello!\"\n",
    "}])\n",
    "print(response[\"content\"])"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:17.081380Z",
     "start_time": "2023-11-29T15:56:10.821200Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "### The `ExplainInfo` class\n",
    "\n",
    "To get information about the LLM calls, call the **explain** function of the `LLMRails` class."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "outputs": [],
   "source": [
    "# Fetch the `ExplainInfo` object.\n",
    "info = rails.explain()"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:17.095649Z",
     "start_time": "2023-11-29T15:56:17.080878Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "#### Colang History\n",
    "\n",
    "Use the `colang_history` function to retrieve the history of the conversation in Colang format. This shows us the exact messages and their canonical forms:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "user \"Hello!\"\n",
      "  express greeting\n",
      "bot express greeting\n",
      "  \"Hello World!\"\n",
      "bot ask how are you\n",
      "  \"How are you doing?\"\n"
     ]
    }
   ],
   "source": [
    "print(info.colang_history)"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:17.096011Z",
     "start_time": "2023-11-29T15:56:17.084868Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "#### LLM Calls\n",
    "\n",
    "Use the `print_llm_calls_summary` function to list a summary of the LLM calls that have been made:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Summary: 1 LLM call(s) took 0.48 seconds and used 524 tokens.\n",
      "\n",
      "1. Task `generate_user_intent` took 0.48 seconds and used 524 tokens.\n"
     ]
    }
   ],
   "source": [
    "info.print_llm_calls_summary()"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:17.096161Z",
     "start_time": "2023-11-29T15:56:17.088974Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The `info` object also contains an `info.llm_calls` attribute with detailed information about each LLM call. That attribute is described in a subsequent guide."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "### The process\n",
    "\n",
    "Once an input message is received from the user, a multi-step process begins.\n",
    "\n",
    "### Step 1: Compute the canonical form of the user message\n",
    "\n",
    "After an utterance, such as  \"Hello!\" in the previous example, is received from the user, the guardrails instance uses the LLM to compute the corresponding canonical form.\n",
    "\n",
    "> **NOTE**: NeMo Guardrails uses a task-oriented interaction model with the LLM. Every time the LLM is called, it uses a specific task prompt template, such as `generate_user_intent`, `generate_next_step`, `generate_bot_message`. See the [default template prompts](../../../nemoguardrails/llm/prompts/general.yml) for details.\n",
    "\n",
    "In the case of the \"Hello!\" message, a single LLM call is made using the `generate_user_intent` task prompt template. The prompt looks like the following:\n"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\"\"\"\n",
      "Below is a conversation between a helpful AI assistant and a user. The bot is designed to generate human-like text based on the input that it receives. The bot is talkative and provides lots of specific details. If the bot does not know the answer to a question, it truthfully says it does not know.\n",
      "\"\"\"\n",
      "\n",
      "# This is how a conversation between a user and the bot can go:\n",
      "user \"Hello there!\"\n",
      "  express greeting\n",
      "bot express greeting\n",
      "  \"Hello! How can I assist you today?\"\n",
      "user \"What can you do for me?\"\n",
      "  ask about capabilities\n",
      "bot respond about capabilities\n",
      "  \"As an AI assistant, I can help you with a wide range of tasks. This includes question answering on various topics, generating text for various purposes and providing suggestions based on your preferences.\"\n",
      "user \"Tell me a bit about the history of NVIDIA.\"\n",
      "  ask general question\n",
      "bot response for general question\n",
      "  \"NVIDIA is a technology company that specializes in designing and manufacturing graphics processing units (GPUs) and other computer hardware. The company was founded in 1993 by Jen-Hsun Huang, Chris Malachowsky, and Curtis Priem.\"\n",
      "user \"tell me more\"\n",
      "  request more information\n",
      "bot provide more information\n",
      "  \"Initially, the company focused on developing 3D graphics processing technology for the PC gaming market. In 1999, NVIDIA released the GeForce 256, the world's first GPU, which was a major breakthrough for the gaming industry. The company continued to innovate in the GPU space, releasing new products and expanding into other markets such as professional graphics, mobile devices, and artificial intelligence.\"\n",
      "user \"thanks\"\n",
      "  express appreciation\n",
      "bot express appreciation and offer additional help\n",
      "  \"You're welcome. If you have any more questions or if there's anything else I can help you with, please don't hesitate to ask.\"\n",
      "\n",
      "\n",
      "# This is how the user talks:\n",
      "user \"Wassup?\"\n",
      "  express greeting\n",
      "\n",
      "user \"Hi\"\n",
      "  express greeting\n",
      "\n",
      "user \"Hello\"\n",
      "  express greeting\n",
      "\n",
      "\n",
      "\n",
      "# This is the current conversation between the user and the bot:\n",
      "# Choose intent from this list: express greeting\n",
      "user \"Hello there!\"\n",
      "  express greeting\n",
      "bot express greeting\n",
      "  \"Hello! How can I assist you today?\"\n",
      "user \"What can you do for me?\"\n",
      "  ask about capabilities\n",
      "bot respond about capabilities\n",
      "  \"As an AI assistant, I can help you with a wide range of tasks. This includes question answering on various topics, generating text for various purposes and providing suggestions based on your preferences.\"\n",
      "user \"Hello!\"\n"
     ]
    }
   ],
   "source": [
    "print(info.llm_calls[0].prompt)"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:17.100528Z",
     "start_time": "2023-11-29T15:56:17.092069Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "The prompt has four logical sections:\n",
    "\n",
    "1. A set of general instructions. These can be [configured](../../user_guides/configuration-guide.md#general-instructions) using the `instructions` key in *config.yml*.\n",
    "\n",
    "2. A sample conversation, which can also be [configured](../../user_guides/configuration-guide.md#sample-conversation) using the `sample_conversation` key in *config.yml*.\n",
    "\n",
    "3. A set of examples for converting user utterances to canonical forms. The top five most relevant examples are chosen by performing a vector search against all the user message examples. For more details see [ABC Bot](../../../examples/bots/abc).\n",
    "\n",
    "4. The current conversation preceded by the first two turns from the sample conversation."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "For the `generate_user_intent` task, the LLM must predict the canonical form for the last user utterance."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  express greeting\n"
     ]
    }
   ],
   "source": [
    "print(info.llm_calls[0].completion)"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:17.142561Z",
     "start_time": "2023-11-29T15:56:17.099106Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "As we can see, the LLM correctly predicted the `express greeting` canonical form. It even went further to predict what the bot should do, which is `bot express greeting`, and the utterance that should be used. However, for the `generate_user_intent` task, only the first predicted line is used. If you want the LLM to predict everything in a single call, you can enable the [single LLM call option](#) in *config.yml* by setting the `rails.dialog.single_call` key to **True**.\n",
    "\n",
    "### Step 2: Determine the next step\n",
    "\n",
    "After the canonical form for the user message has been computed, the guardrails instance needs to decide what should happen next. There are two cases:\n",
    "\n",
    "1. If there is a flow that matches the canonical form, then it is used. The flow can decide that the bot should respond with a certain message, or execute an action.\n",
    "\n",
    "2. If there is no flow, the LLM is prompted for the next step using the `generate_next_step` task.\n",
    "\n",
    "In our example, there was a match from the `greeting` flow and the next steps are:\n",
    "\n",
    "```\n",
    "bot express greeting\n",
    "bot ask how are you\n",
    "```\n",
    "\n",
    "### Step 3: Generate the bot message\n",
    "\n",
    "Once the canonical form for what the bot should say has been decided, the message must be generated. There are two cases:\n",
    "\n",
    "1. If a predefined message is found, the exact utterance is used. If more than one example utterances are associated with the same canonical form, a random one is used.\n",
    "\n",
    "2. If a predefined message does not exist, the LLM is prompted to generate the message using the `generate_bot_message` task. \n",
    "\n",
    "In our \"Hello World\" example, the predefined messages \"Hello world!\" and \"How are you doing?\" are used."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## The follow-up question\n",
    "\n",
    "In the previous example, the LLM is prompted once. The following figure provides a summary of the outlined sequence of steps:\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"../../_assets/puml/core_colang_concepts_fig_1.png\" width=\"486\">\n",
    "</div>\n",
    "\n",
    "Let's examine the same process for the follow-up question \"What is the capital of France?\"."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The capital of France is Paris.\n"
     ]
    }
   ],
   "source": [
    "response = rails.generate(messages=[{\n",
    "    \"role\": \"user\",\n",
    "    \"content\": \"What is the capital of France?\"\n",
    "}])\n",
    "print(response[\"content\"])"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:18.958381Z",
     "start_time": "2023-11-29T15:56:17.101998Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "Let's check the colang history:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "user \"What is the capital of France?\"\n",
      "  ask general question\n",
      "bot response for general question\n",
      "  \"The capital of France is Paris.\"\n"
     ]
    }
   ],
   "source": [
    "info = rails.explain()\n",
    "print(info.colang_history)"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:18.961599Z",
     "start_time": "2023-11-29T15:56:18.958549Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "And the LLM calls:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Summary: 3 LLM call(s) took 1.79 seconds and used 1374 tokens.\n",
      "\n",
      "1. Task `generate_user_intent` took 0.63 seconds and used 546 tokens.\n",
      "2. Task `generate_next_steps` took 0.64 seconds and used 216 tokens.\n",
      "3. Task `generate_bot_message` took 0.53 seconds and used 612 tokens.\n"
     ]
    }
   ],
   "source": [
    "info.print_llm_calls_summary()"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-11-29T15:56:18.965009Z",
     "start_time": "2023-11-29T15:56:18.961386Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "Based on these steps, we can see that the `ask general question` canonical form is predicted for the user utterance \"What is the capital of France?\". Since there is no flow that matches it, the LLM is asked to predict the next step, which in this case is `bot response for general question`. Also, since there is no predefined response, the LLM is asked a third time to predict the final message.\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"../../_assets/puml/core_colang_concepts_fig_2.png\" width=\"686\">\n",
    "</div>\n"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Wrapping up\n",
    "\n",
    "This guide provides a detailed overview of two core Colang concepts: *messages* and *flows*. It also looked at how the message and flow definitions are used under the hood and how the LLM is prompted. For more details, see the reference documentation for the [Python API](../../user_guides/python-api.md) and the [Colang Language Syntax](../../user_guides/colang-language-syntax-guide.md).\n",
    "\n",
    "## Next\n",
    "\n",
    "The next guide, [Demo Use Case](../3_demo_use_case), guides you through selecting a demo use case to implement different types of rails, such as for input, output, or dialog."
   ],
   "metadata": {
    "collapsed": false
   }
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
